(57) Abstract

In digital video synchronisation, the trend in delay is monitored to enable a prediction to be made of the dropping or repeating of
a video field. The audio time compression or expansion is then initiated a selected time interval in advance of the field dropping or
repeating. The loss of audio synchronisation can in this way be kept below perceptible thresholds.

WO 99/52298 PCT/GB99/01041

IMPROVEMENTS RELATING TO AUDIO-VIDEO DELAY

This invention relates to the processing of audio and video signals. 
A common problem in the broadcast environment is the difference in 
delay experienced by the audio and video processes. With many new digital 
video processes in the signal chain and the consequential need for
re-synchronisation, the delay of the video may often differ significantly from that
of the audio. This causes the well-known lip-sync error problem.

A video synchroniser operates by re-timing and, where necessary,
either dropping or inserting fields. Each video synchroniser may therefore
add up to 40ms of video delay in a 50 field per second system such as PAL
or 34ms in NTSC or other 60Hz systems. The precise delay will depend on
often arbitrary system clocks. In new installations the audio is often
"embedded" in the video channel and so when passing through the
synchroniser it will experience the same delay as the video.

Since a discontinuity of 40ms (or 34ms at 60Hz) is unacceptable in
audio, a temporary loss of lip sync is inevitable. In order to recover lip sync,
the audio signal is time compressed or expanded for a period of time. This
period of time must be sufficiently long that the resultant pitch change or other
degradation remains imperceptible and will amount to several seconds. Over
these several seconds, the loss of lip sync can be seriously objectionable.

It can be shown that a loss of synchronisation in which the audio
arrives early is particularly noticeable. It has been suggested that in the case
of the audio being earlier than the video a delay much above 10ms is
perceptible. In the opposite sense, with the audio being delayed, a delay of
up to 30ms may be tolerated.

It is an object of the present invention to provide an improved method
of managing differential delay of digital audio and video signals.

Accordingly, the present invention consists in one aspect in a digital
video synchronisation process in which digital video and audio signals are
delayed by the same varying amount to ensure synchronisation with a timing
reference and, on dropping or repeating of a video field, the audio signal is
time compressed or expanded to recover audio/video synchronisation over a
time period governed by the maximum acceptable pitch change or other
degradation, characterised in that the trend in delay is monitored to enable
prediction of the dropping or repeating of a video field and in that said time
compression or expansion is initiated a selected time interval in advance
thereof.

In one form of the invention, this time interval is selected to minimise
the perceived loss of audio/video synchronisation. That is to say, regard is
had to the asymmetry in the thresholds for perceived loss of synchronisation
for advanced and retarded audio.

The invention will now be described by way of example with reference
to the accompanying drawing which is a block diagram of a digital video
synchroniser according to the invention.

The video synchroniser shown in the drawing receives a digital video
input signal with embedded audio and provides as an output, synchronised
with an externally generated timing reference, a digital video signal, again
with embedded audio.

Digital video first passes into block 10; this extracts the audio data and
the timing signals. Video passes to the main memory 12 to be delayed until it
is co-timed with the output timing signal extracted by block 14 from the
reference input. Block 16 measures the delay between the input and output
timing signals and passes this figure to a microprocessor 18. The calculated
delay is passed to the audio data memory 20. The audio data and video data
memories now give an identical delay. A re-sampling digital filter 22 alters the
audio data sampling rate to match the outgoing video in order that the audio
data can be synchronously inserted by block 24 into the outgoing video data
stream.

A synchroniser such as this, when its input and output timing
references are of a different frequency, as is usual, will occasionally drop or
repeat a video frame. In effect, it gains or loses 40ms (at 50Hz). Whereas the
resulting picture disturbance is often imperceptible, the same is not true for
audio - it is not possible to cut or add 40ms of audio without major
disturbance.

A typical implementation will compensate by re-sampling the audio to a
higher or lower frequency. In the case where the video memory has lost a
frame, the microprocessor 18 will initiate a process of reading extra audio
samples from the audio memory 20. A pitch change results, but if this is kept
to less than 1%, it is unlikely to be noticed. Thus, after a synchroniser drops
or repeats a frame the audio will initially be adrift by a noticeable 40ms. After
8 seconds (assuming 0.5% pitch change) the audio and video will once again
be co-timed. As a result, there will be a perceptible error for up to 6 seconds.

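The figures in the preceding paragraph can be checked with a short calculation. The following is a minimal sketch in Python, not part of the specification; the function names are hypothetical, and the 10ms (audio early) and 30ms (audio late) thresholds are the suggested perception limits quoted earlier in the text.

```python
def recovery_time_s(offset_ms: float, pitch_change: float) -> float:
    """Seconds needed to absorb offset_ms of audio/video error when the
    audio is re-sampled with the given fractional pitch change."""
    return offset_ms / (pitch_change * 1000.0)


def perceptible_time_s(offset_ms: float, threshold_ms: float,
                       pitch_change: float) -> float:
    """Seconds for which the remaining error still exceeds threshold_ms."""
    excess_ms = max(0.0, offset_ms - threshold_ms)
    return excess_ms / (pitch_change * 1000.0)


if __name__ == "__main__":
    frame_ms = 40.0   # one frame at 50Hz, as in the text
    pitch = 0.005     # 0.5% pitch change, as in the text
    print("full recovery:", recovery_time_s(frame_ms, pitch), "s")
    print("perceptible, audio early (10ms threshold):",
          perceptible_time_s(frame_ms, 10.0, pitch), "s")
    print("perceptible, audio late (30ms threshold):",
          perceptible_time_s(frame_ms, 30.0, pitch), "s")
```

On these assumptions the "up to 6 seconds" figure corresponds to the case in which the audio is early, where only 10ms of error is taken to be tolerable.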
In the improved system according to this invention, the microprocessor
18 will continually monitor the video delay and calculate the rate of change of
delay over time. With the highly stable timing references that are generally in
use, it will usually be possible to predict accurately the time that the video
memory will drop or gain a frame. Where it is predicted that, at the current
rate of change, a video frame is due to be lost in 6 seconds' time, the
processor will initiate an increase in the audio delay to give a 0.5% pitch
change. Just before the frame loss, the audio will be 30ms (at 50Hz) delayed
with respect to the video. If the above-discussed thresholds for perception of
an audio delay are correct, this loss in synchronisation is imperceptible.
Immediately after the frame loss the audio will be 10ms advanced with respect
to the video. If the above-discussed thresholds for perception of an audio
advance are correct, this loss in synchronisation is similarly imperceptible.
After a further 2 seconds, the audio and video will be co-timed. Of course, the
pitch change and the balance between worst case advance and delay may be
controlled by the user. Use of the invention has, in this example and on the
assumed perception thresholds, replaced a synchronisation error which is
perceptible for up to 6 seconds by a synchronisation error which is not
perceptible at all.
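The monitoring and prediction step described above might be sketched as follows. This is an illustration in Python under stated assumptions, not the method of the specification: the text does not say how the rate of change of delay is estimated, so a simple two-point slope is assumed here, and the names predict_time_to_limit and should_start are hypothetical. The limits of the video memory are taken to be a delay of zero and a delay of one frame period; which limit corresponds to a dropped and which to a repeated frame is deliberately left open.

```python
from typing import List, Optional, Tuple

FRAME_MS = 40.0  # one frame at 50Hz, as in the text


def predict_time_to_limit(samples: List[Tuple[float, float]],
                          frame_ms: float = FRAME_MS
                          ) -> Optional[Tuple[float, str]]:
    """Estimate when the measured video delay will reach a memory limit.

    samples: (time_s, delay_ms) measurements, oldest first.
    Returns (seconds until the limit is reached, "upper" or "lower"),
    or None if no drift can be measured.
    """
    if len(samples) < 2:
        return None
    (t0, d0), (t1, d1) = samples[0], samples[-1]
    if t1 <= t0:
        return None
    rate = (d1 - d0) / (t1 - t0)  # assumed trend: ms of delay per second
    if rate == 0:
        return None
    if rate > 0:
        return (frame_ms - d1) / rate, "upper"
    return d1 / -rate, "lower"


def should_start(time_to_limit_s: float, lead_s: float) -> bool:
    """True once the predicted event is within the selected lead interval."""
    return time_to_limit_s <= lead_s


if __name__ == "__main__":
    history = [(0.0, 10.0), (10.0, 12.0)]  # delay rising by 0.2ms per second
    prediction = predict_time_to_limit(history)
    print(prediction)
    if prediction is not None:
        print("start compensation now:", should_start(prediction[0], 6.0))
```

A practical implementation would probably smooth the trend over many measurements; the two-point estimate is used only to keep the sketch short.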

When it is predicted that, at the current rate of change, a video frame
is due to be repeated in 2 seconds' time, the processor will initiate a decrease
in the audio delay to give a 0.5% pitch change. Once again the error will be
imperceptible.
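The lead intervals of 6 seconds and 2 seconds in the two examples follow from the assumed thresholds and the 0.5% pitch change. A minimal Python sketch of that arithmetic is given below; the function name plan_compensation and its parameter names are hypothetical, and the figure of 30ms of audio delay after a repeated frame is inferred from the 40ms frame period rather than stated in the text.

```python
from typing import Tuple


def plan_compensation(frame_ms: float, pitch_change: float,
                      pre_event_ms: float) -> Tuple[float, float, float]:
    """Split a frame-sized jump into error taken before and after the event.

    pre_event_ms: error allowed to build up before the frame is dropped or
    repeated (30ms of delay or 10ms of advance in the examples).
    Returns (lead interval in s, error just after the event in ms,
    time after the event until co-timed in s).
    """
    ms_per_second = pitch_change * 1000.0  # error absorbed per second
    lead_s = pre_event_ms / ms_per_second
    post_event_ms = frame_ms - pre_event_ms
    tail_s = post_event_ms / ms_per_second
    return lead_s, post_event_ms, tail_s


if __name__ == "__main__":
    # Frame loss: audio delay is increased in advance, up to 30ms late.
    print("frame lost:    ", plan_compensation(40.0, 0.005, 30.0))
    # Frame repeat: audio delay is decreased in advance, up to 10ms early.
    print("frame repeated:", plan_compensation(40.0, 0.005, 10.0))
```

In both cases the total correction still takes 8 seconds; only its position relative to the event changes. Choosing a pre-event allowance of half the frame period would instead split the error equally between advance and delay.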

It will be understood that the invention has been described by way of
examples only. Thus, rate conversion is only one example of a technique for
time compression or expansion of the audio signal. Numerous alternative
techniques, such as silence compression, will be known to the skilled reader
and fall within the scope of the claimed invention.

CLAIMS




1. A digital video synchronisation process in which digital video and audio
signals are delayed by the same varying amount to ensure
synchronisation with a timing reference and, on dropping or repeating
of a video field, the audio signal is time compressed or expanded to
recover audio/video synchronisation over a time period governed by
the maximum acceptable pitch change or other degradation,
characterised in that the trend in delay is monitored to enable
prediction of the dropping or repeating of a video field and in that said
time compression or expansion is initiated a selected time interval in
advance thereof.

2. A process according to Claim 1, wherein said time interval is selected
to minimise the absolute loss of audio/video synchronisation.

3. A process according to Claim 1, wherein said time interval is selected
to minimise the perceived loss of audio/video synchronisation.

4. A process according to Claim 3, wherein said time interval is selected
such that the period for which the audio is delayed with respect to the
video is greater than the period for which the audio is
advanced with respect to the video.